Multi-party Language Interaction in a Fast-Paced Game Using Multi-keyword Spotting

Authors

  • Jill Fain Lehman
  • Nikolas Wolfe
  • André Pereira
Abstract

Existing speech technology tends to be poorly suited for young children at play, both because of their age-specific pronunciation and because they tend to play together, making overlapping speech and side discussions about the play itself ubiquitous. We report the performance of an autonomous, multi-keyword spotter that has been trained and tested on data from a multi-player game designed to focus on these issues. In Mole Madness, children laugh, yell, speak at the same time, make side comments and even invent their own forms of keywords to control a virtual on-screen character. Within this challenging language environment, the system achieves 94% overall recall and 85% overall accuracy, providing child-child and child-robot pairs with responsive play in a rapid-paced game. This technology can enable others to create novel multi-party interactions for entertainment where a limited number of keywords has to be recognized.

Similar resources

Keyword spotting in multi-player voice driven games for children

Word spotting, or keyword identification, is a highly challenging task when there are multiple speakers speaking simultaneously. In the case of a game being controlled by children solely through voice, the task becomes extremely difficult. Children, unlike adults, typically do not await their turn to speak in an orderly fashion. They interrupt and shout at arbitrary times, speak or say things t...

Robust Multi-Keyword Spotting of Telephone Speech Using Stochastic Matching

In telephone speech recognition, the acoustic mismatch between the training and the test environment often causes severe degradation due to the channel distortion and ambient noise. In this paper, a two-level codebook-based stochastic matching (CBSM) is proposed to deal with the acoustic mismatch. For multi-keyword detection, we define a keyword relation table and a weighting function for reaso...

Language independent and unsupervised acoustic models for speech recognition and keyword spotting

Developing high-performance speech processing systems for low-resource languages is very challenging. One approach to address the lack of resources is to make use of data from multiple languages. A popular direction in recent years is to train a multi-language bottleneck DNN. Language dependent and/or multi-language (all training languages) Tandem acoustic models are then trained. This work con...

Multi-keyword spotting of telephone speech using a fuzzy search algorithm and keyword-driven two-level CBSM

In telephone speech recognition, the acoustic mismatch between training and testing environments often causes a severe degradation in the recognition performance. This paper presents a keyword-driven two-level codebook-based stochastic matching (CBSM) algorithm to eliminate the acoustic mismatch. Additionally, in Mandarin speech, it is difficult to correctly recognize the unvoiced part in a sylla...

Robust Keyword Spotting Using a Multi-Stream Approach

Speech recognition systems are prone to severe degradation in noisy environments due to mismatch between training and testing conditions. A multi-stream approach for keyword spotting is proposed to improve robustness in mismatched conditions. The assumption is that most real world noises are colored and do not affect the full spectrum equally, meaning certain parts of the spectrum can still pro...

Publication date: 2016